Words containing ,:

Rank in Wordlist | Frequency | Word |
---|---|---|
2953 | 156 | 1,5 |
3242 | 142 | 2,5 |
4930 | 87 | 3,5 |
5579 | 75 | 0,5 |
6535 | 62 | 1,2 |
7598 | 52 | 1,6 |
8375 | 46 | 1,8 |
8695 | 44 | 100,7 |
8697 | 44 | 4,5 |
9419 | 40 | 1,4 |

Words containing ):

Rank in Wordlist | Frequency | Word |
---|---|---|
133024 | 1 | .) |

Words containing %:

Rank in Wordlist | Frequency | Word |
---|---|---|
2726 | 169 | 100% |
6432 | 63 | 50% |
7368 | 54 | 10% |
8525 | 45 | 20% |
10479 | 35 | 80% |
10946 | 33 | 30% |
11888 | 30 | 90% |
12218 | 29 | 3% |
12565 | 28 | 40% |
13357 | 26 | 1% |

Words containing &:

Rank in Wordlist | Frequency | Word |
---|---|---|
3743 | 120 | Villeroy & Boch |
17334 | 19 | GmbH & Co. KG |
19607 | 16 | & Co |
21568 | 15 | natur&ëmwelt |
22331 | 14 | S&P |
26257 | 11 | A&D |
27442 | 11 | Villeroy&Boch |
38390 | 7 | M&A |
38673 | 7 | P&R |
42184 | 6 | Ernst & Young |

Words containing $:

Rank in Wordlist | Frequency | Word |
---|---|---|
283199 | 1 | Simeonsti$platzes |

Words containing ":

Rank in Wordlist | Frequency | Word |
---|---|---|
344 | 1198 | ." |

Words containing ':

Rank in Wordlist | Frequency | Word |
---|---|---|
11341 | 32 | L'essentiel |
15105 | 23 | d'Or |
18500 | 18 | gibt's |
19338 | 17 | d'Lëtzebuerger |
20900 | 15 | Giro d'Italia |
21417 | 15 | d'Italia |
22639 | 14 | d'art |
22708 | 14 | geht's |
24060 | 13 | d'histoire |
29897 | 10 | d'Armes |

Words containing +:

Rank in Wordlist | Frequency | Word |
---|---|---|
28785 | 10 | Google + |
50451 | 5 | P+R |
69538 | 3 | DVD+R |
70795 | 3 | FIT+FUN |
72235 | 3 | Gruner + Jahr |
74098 | 3 | Kuehne+Nagel |
86773 | 2 | 1+2 |
86774 | 2 | 1+5 |
87692 | 2 | 4+5 |
87962 | 2 | 6+8+10 |

Words containing /:

Rank in Wordlist | Frequency | Word |
---|---|---|
2476 | 188 | l/m² |
2761 | 167 | Esch/Alzette |
3009 | 154 | und/oder |
3343 | 138 | km/h |
7138 | 57 | https://www |
8519 | 45 | 1/2 |
8775 | 44 | Neues/ |
12313 | 29 | Ihr/e/n |
13159 | 27 | SmAdv/Mag |
14113 | 25 | g/m² |

In the last subsection of this type we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words while others are not. If words containing forbidden characters do not have very low frequencies, there may be a problem in preprocessing.
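
The check described above can be sketched in a few lines of Python. This is only an illustration: the function name `flag_suspicious`, the set of forbidden characters, and the frequency threshold are assumptions, since both the allowed characters and what counts as "very low frequency" depend on the language and the corpus. The sample pairs are copied from the tables above.

```python
def flag_suspicious(words, forbidden, min_freq):
    """Return (word, freq) pairs that contain a forbidden character
    and occur at least min_freq times."""
    return [
        (word, freq)
        for word, freq in words
        if freq >= min_freq and any(ch in forbidden for ch in word)
    ]


# (word, frequency) pairs copied from the tables above.
sample = [
    (".)", 1),
    ("Simeonsti$platzes", 1),
    ('."', 1198),
    ("100%", 169),
    ("km/h", 138),
]

# Hypothetical settings, not taken from the source.
FORBIDDEN = set('()$"')
MIN_FREQ = 5

suspicious = flag_suspicious(sample, FORBIDDEN, MIN_FREQ)
print(suspicious)  # [('."', 1198)]
```

With these assumed settings, the two words of frequency 1 are ignored as noise, while `."` with frequency 1198 is reported as a candidate for a preprocessing problem.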
Example query for words containing %:
select w_id-100, freq, word from words where w_id>100 and word like "%\%%" limit 10;
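
The same query can be run for every character in the list, provided the characters that have a special meaning in LIKE patterns (% and _) are escaped. The following is a minimal sketch using Python's `sqlite3` module with an in-memory database as a stand-in for the real one. The table layout `words(w_id, freq, word)` is inferred from the query above, the helper names are hypothetical, and the `ORDER BY w_id` clause is an addition that assumes ids are assigned in frequency order. Unlike the query above, SQLite needs an explicit `ESCAPE` clause. The demo rows are taken from the tables above, with `w_id` set to rank + 100.

```python
import sqlite3

SPECIAL_CHARS = [",", "(", ")", "%", "&", "$", '"', "'", "+", "*", "=", "/", "_"]


def like_pattern(char):
    """Build a LIKE pattern matching any word that contains `char` literally."""
    # % and _ are LIKE wildcards and \ is the escape character chosen below,
    # so these three must be escaped.
    if char in ("%", "_", "\\"):
        char = "\\" + char
    return "%" + char + "%"


def words_containing(conn, char, limit=10):
    """Return (rank, freq, word) rows for words containing `char`."""
    query = (
        "SELECT w_id - 100, freq, word FROM words "
        "WHERE w_id > 100 AND word LIKE ? ESCAPE '\\' "
        "ORDER BY w_id LIMIT ?"  # assumption: lower w_id means higher frequency
    )
    return conn.execute(query, (like_pattern(char), limit)).fetchall()


def build_demo_db():
    """Create an in-memory table with a few rows from the tables above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE words (w_id INTEGER PRIMARY KEY, freq INTEGER, word TEXT)"
    )
    rows = [
        (2826, 169, "100%"),
        (6532, 63, "50%"),
        (3109, 154, "und/oder"),
        (3443, 138, "km/h"),
        (22431, 14, "S&P"),
        (50551, 5, "P+R"),
    ]
    conn.executemany("INSERT INTO words VALUES (?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    for ch in SPECIAL_CHARS:
        for rank, freq, word in words_containing(conn, ch):
            print(ch, rank, freq, word)
```

Passing the pattern as a bound parameter avoids quoting problems for " and ', which would otherwise have to be escaped inside the SQL string literal. Without the escaping in `like_pattern`, the pattern for _ would match every non-empty word.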